[poster] Expanding Hubverse Evaluation Metrics and Dashboard Support#34
Conversation
Adds the project poster for expanding hubverse evaluation metrics and dashboard support (five mini-sprints: UI polish, config-driven enhancements, scale transforms, variogram score, documentation). Also corrects the README poster instructions to use `project-posters/` instead of `posters/`, matching the actual convention in the repo. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
seabbs
left a comment
This looks good to me. The variogram score has already landed on scoringutils `main`, and there should be a CRAN release this week. We expect some small changes (e.g. to documentation) to make it easier to use, but nothing breaking.
project-posters/eval-metrics-expansion/eval-metrics-expansion.md
**hubPredEvalsData changes:**
- Add `transform_defaults` (top-level) and per-target `transform` to `inst/schema/v1.1.0/config_schema.json`
- Allowed transform functions: `log_shift`, `sqrt`, `log1p`, `log`, `log10`, `log2`
- `append: true/false` — when true, scores.csv gains a `scale` column (`"natural"` or transform label)
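To make the proposed config concrete, here is a hypothetical sketch of what the schema additions above might look like in a predevals config file. The property names (`transform_defaults`, `transform`, `append`) come from the list above; the overall layout and the target names are assumptions for illustration only — hubPredEvalsData#34 is the authoritative spec.

```json
{
  "transform_defaults": {
    "transform": "log_shift",
    "append": true
  },
  "targets": [
    {
      "target_id": "wk inc flu hosp",
      "transform": "sqrt"
    },
    {
      "target_id": "wk flu hosp rate category",
      "transform": false
    }
  ]
}
```

Here the second (pmf-style) target opts out with `transform: false`, following the semantics discussed later in this thread.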
Given you want a wide table, it might be good to think through whether hubPredEvalsData should also output a wide table. This would make the table easier to present; I'm not sure about the evals visualisation. It would be great to hear @matthewcornell's thoughts on this.
Sorry, I don't understand enough to answer yet. Is there a summary of the specific changes you're asking about? I haven't touched any of the score-loading/manipulating code in predevals at this point.
I think we want to treat scores on a transformed scale as a separate score, consistently throughout. This means, I think, treating each as a separate column, which will mean new columns in the tables and new variable names in the menu selectors for the plots. So maybe this needs to be updated so that scores.csv would return new columns for the transformed scores, rather than a new column for scale?
@annakrystalli are you saying that hubPredEvalsData currently outputs a long table but that we might want to change it to output a wide table given the requirements that we want the eventual table to be displayed in wide format?
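To illustrate the long-vs-wide distinction being discussed (a toy sketch, not hubverse code — the column names `model_id`, `metric`, and the score values are invented for illustration):

```python
import pandas as pd

# Long form: one row per (model, score), as a scores.csv with a
# per-score identifier column might look.
long_scores = pd.DataFrame({
    "model_id": ["A", "A", "B", "B"],
    "metric": ["wis", "wis_log", "wis", "wis_log"],
    "value": [10.0, 1.2, 8.0, 0.9],
})

# Wide form: one column per score, so a transformed-scale score
# ("wis_log") becomes its own column rather than a scale label.
wide_scores = (
    long_scores
    .pivot(index="model_id", columns="metric", values="value")
    .reset_index()
)
print(wide_scores)
```

In the wide form, the transformed score is just another column, which is what the "new columns for the transformed scores" suggestion above would mean for the tables and plot selectors.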
Thanks for putting this together, @nickreich. Having a single overview of the full project scope across all the workstreams is really valuable.
I have some high-level structural comments in addition to my inline comments.
1. Remove AGENTS.md
We use Claude Code interactively rather than as autonomous agents; we just point it at the relevant document for context. This file duplicates content from the poster (pipeline architecture, sprint structure, design decisions, open questions, key files), creating two documents to review and keep in sync. The poster itself is sufficient.
2. Development standards section doesn't belong here
The "Development standards" section (issue refinement format, TDD workflow, universal DoD) prescribes team-wide methodology; that's a separate discussion, not something to embed in a project-specific poster. If we agree on a standard workflow, it should live in a team-level contributing guide (or even a Claude Code skill) where it's discoverable and reusable. As it stands, it's also something we haven't discussed as a team.
3. Level of detail
The poster has a lot of implementation-level detail (per-issue DoD checklists, file-level change specs, validation behaviour specifics) that goes beyond what's expected in a high-level project poster. A poster should capture the problem space, workstream overview, sequencing, and risks; the implementation detail belongs in issues in the relevant repos.
4. Consistency with existing planning work
Some of the Sprint C planning was already done in detail in hubPredEvalsData#34, and restating it here has introduced some inconsistencies:
- The Sprint C DoD says `transform: null` for opt-out but the issue's schema uses `transform: false`; these have different semantics.
- The poster says config applying a transform to a pmf target "fails validation," but the issue specifies a two-tier approach (error if explicitly set, warn if inherited from defaults).
The Sprint D section also introduces `joint_across` as a new config property name, but the underlying hubEvals parameter is `compound_taskid_set`, which is also what we use in `tasks.json`. These are actually opposite concepts, so I'd suggest using `compound_taskid_set` consistently to avoid confusion.
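For context on the established term, `compound_taskid_set` appears in hubverse `tasks.json` under the sample output type's parameters. A rough sketch from memory (field names and placement should be checked against the hubverse schema docs; the values here are illustrative):

```json
{
  "output_type": {
    "sample": {
      "output_type_id_params": {
        "is_required": false,
        "type": "character",
        "max_length": 10,
        "min_samples_per_task": 100,
        "max_samples_per_task": 100,
        "compound_taskid_set": ["reference_date", "location"]
      }
    }
  }
}
```

Reusing this name in the poster keeps the predevals config aligned with terminology modelers already see in hub configuration.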
Sprint C should link directly to hubPredEvalsData#34.
The README fix looks good.
This is super helpful, @nickreich. Overall I agree with @annakrystalli's points. WRT the impact on the UI component and other areas I've been contributing to, I think I'd need to sit down and go over some concrete examples to understand the changes. As I said above, I haven't worked with the scoring data in the UI, just interface stuff (if that makes sense). Re: "predevals JS changes":
Are we saying that these are a kind of virtual/dynamic score that has to be added at UI time, rather than being generated as separate columns? If so, this would make me nervous. It would help me if we could review the changes I'll be responsible for together in detail, so I can understand the implications before we move too far along.
From @matthewcornell
No. All scores will be computed and generated as separate columns beforehand. No UI computation.
- Remove AGENTS.md (duplicated poster content); migrate pipeline diagram and repo links into the poster's "What do we already know" section
- Remove Development Standards section (team methodology, not project scope)
- Trim per-sprint DoD checklists to brief acceptance criteria
- Slim down Sprint C to reference hubPredEvalsData#34 as the authoritative implementation plan, eliminating duplicated/conflicting detail
- Fix `transform: null` to `transform: false` (matching hubPredEvalsData#34)
- Replace `joint_across` with `compound_taskid_set` throughout (matching established hubverse terminology)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per a brief conversation today w/ @nickreich, here's the plan we came up with when we made our last hubPredEvalsData schema change (adding the …
matthewcornell
left a comment
I think it looks good. Thanks, Nick.
Co-authored-by: Nicholas G Reich <nick@umass.edu>
Summary
This PR adds the project poster for expanding the hubverse forecast evaluation ecosystem and fixes a README inconsistency.
Project poster (`project-posters/eval-metrics-expansion/`)
- `AGENTS.md` for AI agent / contributor onboarding context

README fix
Corrects the poster creation instructions to use `project-posters/<project>/` instead of `posters/<project>/`, matching the actual convention used by all existing posters in the repo.

Review timeline
1 week suggested.
🤖 Generated with Claude Code